ABSTRACT



We demonstrate the use of a variant of Principal Component Analysis (PCA) 
for discrimination problems in astronomy. This variant of PCA is shown to 
provide the best linear discrimination between data classes. As a test case, we 
present the problem of discrimination between K giant and K dwarf stars from 
intermediate resolution spectra near the Mg 'b' feature. The discrimination 
procedure is trained on a set of 24 standard K giants and 24 standard K dwarfs, 
and then used to perform giant-dwarf classification on a sample of ~1500
field K stars of unknown luminosity class which were initially classified visually.
For the highest S/N spectra, the automated classification agrees very well (at
the 90-95% level) with the visual classification. Most importantly, however,
the automated method is found to classify stars in a repeatable fashion, and, 
according to numerical experiments, is very robust to signal to noise (S/N) 
degradation. 



Subject headings: Numerical techniques, stellar classification, K stars 

1. Introduction

Studies of the large scale kinematic and chemical structure of our Galaxy often
investigate the properties of samples of stars that are believed to trace Galactic mass. One
of the most used species is the K giant (e.g., Ibata & Gilmore 1995a, Rich 1990, Rich
1988, Kuijken & Gilmore 1989, Lewis & Freeman 1989). These stars are particularly
useful as they give a fair representation of the underlying stellar distribution (cf. e.g., Ibata
& Gilmore 1995a). They have high intrinsic luminosity and well-behaved spectral features
whose differences can be interpreted in terms of changes in temperature, abundance and
surface gravity. However, local, intrinsically faint K dwarfs can contaminate these samples
considerably. Fortunately, the sensitivity of absorption lines to the surface gravity of these
stars allows one to discriminate between K giants and K dwarfs: this may be performed
visually, by comparison to a grid of standards (e.g., Kuijken & Gilmore 1989), or by
minimizing a statistic constructed from the stellar spectra and a grid of synthetic standards
(e.g., Cayrel et al. 1991a, 1991b).

A number of alternative methods based on real stellar templates have been used for
spectral classification, including Artificial Neural Networks (e.g., von Hippel et al. 1994),
minimum distance methods, and assorted methods based on cross-correlation (e.g., Kurtz
1984). Neural networks can offer a very sophisticated non-linear combination of input
parameters and can be thought of as a variant of non-linear least squares minimization
closely tied to a Bayesian classification scheme, while cross-correlation and the closely
related (weighted) minimum distance methods are straightforward variants of least-squares
fitting to standard templates. In this note we demonstrate the use of a robust optimal
linear discrimination scheme based on a variant of Principal Components Analysis.

The data-set that is examined below was obtained with the AUTOFIB multi-fiber
spectrograph at the Anglo-Australian Telescope (AAT) with the aim of investigating the
kinematic and abundance structure of both the inner Milky Way (Ibata & Gilmore 1995a,
1995b) and the Sagittarius dwarf galaxy (Ibata, Gilmore & Irwin 1994). Though this
instrument was efficient in gathering large samples of spectroscopic data, the resulting
spectra cannot be directly compared to flux-calibrated spectra, because the spectrograph
induces large, low-frequency variations in the shape of the spectra over wavelength ranges
of width typically ~200 Å. This meant that the giant-dwarf discrimination technique
of Cayrel et al. (1991) could not be applied to the AUTOFIB data without considerable
recalibration.

Ibata & Gilmore (1995a) therefore initially classified their R ~ 3000 spectra visually,
following the prescription detailed in Kuijken & Gilmore (1989). However, it was clearly 
desirable to design an automated algorithm that is repeatable, that classifies stars to lower 
signal to noise than is possible visually, and that allows an estimate of the certainty of the 
classification to be made. 

2. Visual Dwarf - Giant Classification for K stars 

The survey-star spectra were first compared empirically to a grid of K giant and K
dwarf standards. The standard spectra were observed by Kuijken & Gilmore (1989) and
Ibata & Gilmore (1995a), again with the AUTOFIB fiber system. Standards at several
(B - V)_0 are presented in Figure 1; a list of these stars is given in Table 1. The field
dwarfs and metal poor dwarfs (subdwarfs) are from Bessell & Wickramasinghe (1979) and
Rodgers & Eggen (1974), the metal rich dwarfs (Hyades dwarfs) are from Pels et al. (1975)
and Upgren & Weiss (1977), while the giants are from Yoss et al. (1981), Friel (1986) and
Faber et al. (1985). The most striking features in these spectra are the three Mg 'b' lines
(at 5167, 5173 and 5184 Å) and the MgH band at 5211 Å (which also belongs to the Mg 'b'
feature). The other prominent lines are mostly TiO, Fe I and Fe II. Several properties of
K star atmospheres can be seen in the grid. In dwarfs, the prominent MgH band (5211
Å) is seen for (B - V) ≳ 1.05, while in giants it appears only for (B - V) ≳ 1.25. Fe
lines are weaker even in super-metal-rich giants (cf. Table 1) than in Hyades dwarfs of the
same color. For those stars with (B - V) > 1.1, the wide Mg 'b' absorption band (a wide
dip stretching over 5050 Å ≲ λ ≲ 5200 Å) is strong in dwarfs, but is weak until
(B - V) ~ 1.3 in giants.

Cayrel et al. (1991) calculate synthetic spectra to find the surface gravity dependence
of a K star spectrum in the wavelength range 4800 to 5300 Å at fixed effective temperature
and metallicity. In this situation they show that dwarfs display much stronger Mg, Fe and
MgH lines than giants (because giants have lower surface gravity atmospheres and hence
lower opacities). Cayrel et al. also calculate the metallicity dependence at constant surface
gravity and effective temperature: as would be expected, higher metallicity increases the
depth of the Mg and Fe lines and the MgH band (except for saturated Mg lines in metal
poor dwarfs). Their results show clearly that the Mg 'b' triplet and MgH band are more
sensitive to gravity than to metallicity for stars of [Fe/H] ≳ -1.25, and that these lines
can be as weak in metal poor dwarfs as they are in giants. Fortunately, along the lines of
sight to these survey stars, starcount galaxy models predict a negligible contribution
(< 0.01%) of metal poor dwarfs (foreground halo stars) in the samples (Ibata 1994).

K giants and K dwarfs were in this way visually classified by comparison to the
standards in Figure 1. The spectra were also binned into four groups according to a
subjective ranking of the certainty of the classification. The giant-dwarf classification was
deemed to be satisfactory for high S/N spectra, but was clearly unsatisfactory for noisy
spectra (judging from repeated attempts at classification), especially at the bluer end of
the selection range ((B - V)_0 < 1.0).

3. Technique

Below, we first remind the reader of the standard PCA technique, and then describe 
the variant of this method which was successfully used to discriminate between K giants 
and K dwarfs. 

To begin with, the spectra to be analyzed are shifted into their rest frames and binned
linearly over a fixed wavelength range (4800-5500 Å) into a fixed number of bins N (500).
Each spectrum can thus be represented as a point in the N-dimensional vector space of all
possible (similarly binned) spectra.
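
The preprocessing step just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rebin_to_rest_frame`, the use of a known radial velocity for the rest-frame shift, and the choice of linear interpolation are ours and are not taken from the text, which specifies only the wavelength range and the number of bins.

```python
import numpy as np

def rebin_to_rest_frame(wave_obs, flux, velocity_kms,
                        wave_min=4800.0, wave_max=5500.0, n_bins=500):
    """Shift a spectrum to its rest frame and resample it onto a fixed linear grid.

    wave_obs must be increasing and cover [wave_min, wave_max] after the shift.
    """
    c_kms = 299792.458
    # Non-relativistic Doppler correction (an assumption; adequate for stellar velocities).
    wave_rest = np.asarray(wave_obs, dtype=float) / (1.0 + velocity_kms / c_kms)
    # Centres of n_bins equal-width bins spanning [wave_min, wave_max].
    edges = np.linspace(wave_min, wave_max, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Linear interpolation stands in for whatever rebinning scheme was actually used.
    return centres, np.interp(centres, wave_rest, np.asarray(flux, dtype=float))
```

Each returned flux array is then one point in the N-dimensional space; stacking the arrays row by row gives the data matrix used in the sketches that follow.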

As an example, consider the set of n_d standard dwarf spectra. This set can be
represented as a cloud of n_d points in the above vector space. The aim of the standard PCA
classification scheme is to concentrate the information in the n_d N-dimensional points into a
set of q (q < n_d) orthogonal N-dimensional vectors which are able to describe "dwarf-ness"
to good approximation (in a least squares sense). The first of these vectors, a_1, is the
direction along which the cloud of dwarf stars is most elongated, that is, the direction
of a least squares line fit to the dwarf points that passes through the mean point (mean
spectrum). This is the first order least squares description of the data. The variation of
spectra in the direction a_1 is the greatest in the data set, so it is removed by collapsing the
cloud of points along a_1 to give a new data-set of dimension N - 1. The second order least
squares description a_2 is calculated from this new data set in the same way as a_1 was from
the original set. This process is iterated so that the ith principal component is calculated
from a data-set formed by successively collapsing the original data-set along the 1st to the
(i - 1)th principal component directions. The maximum number q_max of such vectors that
can be found is either N (there are only N dimensions to collapse the data-set into) or n_d
(when all points lie exactly along the (q_max = n_d)th principal component). If dwarf star
spectra have regular patterns, the cloud of points in the vector space will be localized, so
we expect to be able to account for most of the variance in the sample with a small number
of principal components and the aim of the operation will have been fulfilled.

It can easily be shown (e.g., Francis 1991) that this process is equivalent to finding
the eigenvectors corresponding to the largest eigenvalues of the matrix

$$ \sum_{k=1}^{n_d} \mathbf{x}_k \mathbf{x}_k^{H}, \tag{1} $$

where H denotes Hermitian conjugate, and x_k is the kth sample vector, measured from the
mean spectrum.
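
The eigenvector formulation of standard PCA can be illustrated with a short numerical sketch. The function name `principal_components` is hypothetical, and we assume real-valued spectra (so the Hermitian conjugate reduces to a transpose) stored one per row; the normalization of the matrix is left out because it does not change the eigenvectors.

```python
import numpy as np

def principal_components(spectra, q):
    """Return the q largest eigenvalues and the matching principal components.

    spectra : array of shape (n_d, N), one binned spectrum per row.
    """
    X = np.asarray(spectra, dtype=float)
    X = X - X.mean(axis=0)                 # measure each spectrum from the mean spectrum
    scatter = X.T @ X                      # N x N matrix: sum over k of x_k x_k^H
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:q]        # indices of the q largest
    return eigvals[order], eigvecs[:, order].T   # components returned as rows
```

For a cloud of points that is elongated along a single direction, the first returned component is that direction (up to an overall sign), which is the least squares line through the mean point described above.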

The problem that needs to be addressed, however, is how to discriminate between
classes of spectra (or clouds of points in the vector space of possible spectra). The variant
of PCA employed here does not deconstruct a single set of spectra (cloud of points) as
above, but instead deconstructs the set of difference vectors between points of different
classes (again in a least squares sense) (see e.g., Ullman 1973). We will denote x_k^(μ) as
the kth sample vector of class μ. Since the mean spectrum x̄ contains no discriminatory
information, we first subtract x̄ from all the x_k^(μ): this does not affect discrimination and
avoids the problem of the mean spectrum dominating the covariance matrix (Equation 10
below) which can make the eigenvector equation (Equation 9) unsuitable for solution with
simple numerical algorithms.

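Define a linear transformation A, such that

$$ \mathbf{y}_k^{(\mu)} = A^{H} \mathbf{x}_k^{(\mu)}, \tag{2} $$

where y_k^(μ) is to be set up such that it contains the maximum amount of discriminatory
information based on a least-mean-square representation of all the difference vectors between
the sets. Let a_i be the vector elements of the transformation matrix A = (a_1, ..., a_i, ..., a_M),
where M is a fixed number of elements less than or equal to N. We therefore seek unit
vectors a_i that maximize the quantity

$$ s^2 = \sum_{k,l,\mu,\mu'} \left( \mathbf{y}_k^{(\mu)} - \mathbf{y}_l^{(\mu')} \right)^{H} \left( \mathbf{y}_k^{(\mu)} - \mathbf{y}_l^{(\mu')} \right), \tag{3} $$

where the sum runs over all pairs of sample vectors drawn from different classes (μ ≠ μ').
The requirement that the a_i be unit vectors imposes the constraint

$$ \mathbf{a}_i^{H} \mathbf{a}_i = 1. $$

Using Lagrange multipliers, we may write:

$$ s'^2 = \sum_{k,l,\mu,\mu'} \left( \mathbf{y}_k^{(\mu)} - \mathbf{y}_l^{(\mu')} \right)^{H} \left( \mathbf{y}_k^{(\mu)} - \mathbf{y}_l^{(\mu')} \right) - \sum_{i=1}^{M} \lambda_i \left( \mathbf{a}_i^{H} \mathbf{a}_i - 1 \right), $$

where λ_i is the Lagrange multiplier for a_i. Substituting for y_k^(μ) from Equation 2:

$$ s'^2 = \sum_{i=1}^{M} \left[ \sum_{k,l,\mu,\mu'} \mathbf{a}_i^{H} \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right) \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right)^{H} \mathbf{a}_i - \lambda_i \left( \mathbf{a}_i^{H} \mathbf{a}_i - 1 \right) \right]. $$

Differentiating:

$$ \left[ \sum_{k,l,\mu,\mu'} \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right) \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right)^{H} - \lambda_i I \right] \mathbf{a}_i = 0, $$

or:

$$ \left[ C - \lambda_i I \right] \mathbf{a}_i = 0. \tag{9} $$

Therefore the a_i are eigenvectors of the Hermitian matrix C:

$$ C = \sum_{k,l,\mu,\mu'} \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right) \left( \mathbf{x}_k^{(\mu)} - \mathbf{x}_l^{(\mu')} \right)^{H}, \tag{10} $$

which is simply the covariance matrix of the difference vectors. The eigenvectors
a_1, ..., a_i, ..., a_M of the covariance matrix define the linear transformation A (Equation 2
above).

Substituting Equation 9 into Equation 3 and using the orthogonality property of the
eigenvectors, we find

$$ s^2 = \sum_{i=1}^{m} \lambda_i, \tag{11} $$

where m is the number of eigenvectors used. Clearly the larger the eigenvalue the larger the
discriminating power of the corresponding eigenvector. Therefore, the M eigenvectors with
the largest eigenvalues give the best M-dimensional linear discrimination between classes μ
and μ'.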
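
A minimal numerical sketch of this discriminating variant is given below, assuming two classes of real-valued spectra. The function name `discriminating_vectors` is hypothetical. Each cross-class pair is counted once here, whereas a sum over both orderings of the class labels counts it twice; this rescales the eigenvalues by a constant factor but leaves the eigenvectors, and hence the discrimination, unchanged.

```python
import numpy as np

def discriminating_vectors(class_a, class_b, m=1):
    """Leading eigenvectors of the covariance matrix of cross-class difference vectors.

    class_a, class_b : arrays of shape (n_a, N) and (n_b, N), one spectrum per row.
    Returns the m largest eigenvalues and the matching unit eigenvectors (as rows).
    """
    A = np.asarray(class_a, dtype=float)
    B = np.asarray(class_b, dtype=float)
    # Every difference between a member of one class and a member of the other.
    diffs = (A[:, None, :] - B[None, :, :]).reshape(-1, A.shape[1])
    C = diffs.T @ diffs                      # N x N matrix of Equation 10 (real case)
    eigvals, eigvecs = np.linalg.eigh(C)     # C is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1][:m]
    return eigvals[order], eigvecs[:, order].T
```

For two small clouds separated along one axis, for example `[[1, 0.1], [1, -0.1]]` and `[[-1, 0.1], [-1, -0.1]]`, the leading vector lies along the separation axis, while ordinary PCA applied to either cloud alone would pick out the direction of scatter within that cloud.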



4. Results 

We first find the mean spectrum of the 48 standard stars displayed in Figure 1. The
covariance matrix C (Equation 10) is formed using the mean-corrected standard star
spectra. The eigenvectors a_i and related eigenvalues λ_i of the covariance matrix C are then
calculated. As explained above, the eigenvectors corresponding to the largest eigenvalues
are those which contain the greatest amount of discriminatory information: the ten largest
eigenvalues are given in Table 2. From the table we see that the first order discriminating
vector (that with the largest eigenvalue) accounts for a large fraction (50%) of the
total discrimination, with the next nine contributing only a further 35%. We therefore
investigate whether the first order eigenvector is sufficient to allow distinction between K
dwarfs and K giants. This eigenvector is shown in Figure 2: several spectral features of K
stars are visible, notably the Mg 'b' feature, the MgH band at 5205 Å and some prominent
Fe lines.
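
The percentages quoted from Table 2 can be checked with a few lines of arithmetic. The eigenvalues below are copied from the table; the full trace is not listed there, so as an assumption it is inferred from the first row (trace = λ_1 divided by its tabulated fraction).

```python
# Eigenvalues copied from Table 2 (ten largest).
eigenvalues = [7027827.5, 1836905.6, 1277226.4, 606478.0, 295777.4,
               251045.7, 207668.2, 159497.0, 149020.3, 146001.9]
table_percent_first = 50.145   # tabulated "% of trace" for the first eigenvalue

# Assumption: infer the full trace from the first row of the table.
trace = eigenvalues[0] / (table_percent_first / 100.0)

percent = [100.0 * lam / trace for lam in eigenvalues]
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

first_vector = percent[0]                     # share of the leading eigenvector
next_nine = cumulative[-1] - cumulative[0]    # share added by eigenvectors 2 to 10
print(f"first: {first_vector:.1f}%  next nine: {next_nine:.1f}%  "
      f"total: {cumulative[-1]:.1f}%")
```

The recomputed columns agree with the tabulated values to within rounding: the leading eigenvector carries about 50% of the trace and the next nine together add about 35%.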

For each star spectrum x that is to be classified, a coefficient c = (x - s) · e is calculated
from the eigenvector e (shown in Figure 2) and the average standard star spectrum s.
Figure 3 shows the relation between (B - V) and c calculated for the standard stars;
the 'stars' in the diagram represent the giants in the sample of standards, the large dots
represent dwarf stars and the small dots represent subdwarfs. Superimposed is a straight
line drawn by eye marking the boundary between giants and dwarfs. For (B - V)_0 > 1.1,
the classification scheme appears to work well. For (B - V) ≲ 1.0, the classification is less
clear cut (but it is also very difficult to classify these hotter stars visually; cf. Section 2).
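
The classification step can be sketched as follows. The coefficient is exactly the projection defined above; everything about the boundary is hypothetical. The text says only that a straight line was drawn by eye in the ((B - V), c) plane, so the `slope` and `intercept` parameters, and the `giants_above` flag stating on which side of the line the giants fall (which depends on the arbitrary sign of the eigenvector), are placeholders to be read off a plot such as Figure 3.

```python
import numpy as np

def discrimination_coefficient(x, s, e):
    """c = (x - s) . e for spectrum x, mean standard spectrum s and eigenvector e."""
    return float(np.dot(np.asarray(x, dtype=float) - np.asarray(s, dtype=float), e))

def classify(c, b_minus_v, slope, intercept, giants_above=True):
    """Classify a star by which side of a straight line in the (B - V, c) plane it lies.

    slope, intercept and giants_above are hypothetical inputs describing the
    by-eye boundary; they are not values given in the text.
    """
    boundary = slope * b_minus_v + intercept
    is_above = bool(c > boundary)
    return "giant" if is_above == giants_above else "dwarf"
```

A possible usage, with made-up boundary parameters, is `classify(discrimination_coefficient(x, s, e), 1.2, slope=0.0, intercept=0.0)`. The distance of c from the boundary also gives a natural, if rough, indication of how secure the classification is.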

Figure 4 shows the same plot for the survey stars, where the (B - V) values have been
taken from the calibrated APM photometry (cf. Ibata & Gilmore 1995a).

We now estimate the accuracy of this classification scheme. The effect of photon noise
on classification is investigated by degrading the standard star spectra using the Poisson
random number generator of Press et al. (1986). We find an rms error in the coefficient
c of ~10 when the signal to noise is degraded to S/N ≈ 5 (for the standard stars, c takes
values -100 ≲ c ≲ 100). A very much larger source of error in the classification of survey
stars arises simply from the rms color error δ(B - V) ≈ 0.18 of these data (Ibata & Gilmore
1995a). Assuming that each point in Figure 4 has a probability density that is a Gaussian
distribution with σ = 0.18 along the (B - V) direction, we find that ~15% of giants and
≈ 25% of dwarfs are on average misidentified.
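
The photon noise experiment can be sketched as below. This is an illustration under assumptions, not a reproduction of the original procedure: NumPy's Poisson generator is swapped in for the Press et al. routine, the function names are hypothetical, and S/N is defined per bin at the mean flux level, so that a bin holding n counts has S/N = sqrt(n).

```python
import numpy as np

def degrade_to_snr(flux, target_snr, rng):
    """Return a Poisson-noise realization of a (non-negative) spectrum.

    The spectrum is rescaled so that the mean bin holds target_snr**2 counts,
    drawn from a Poisson distribution, and scaled back to the original units.
    """
    flux = np.asarray(flux, dtype=float)
    scale = target_snr**2 / flux.mean()
    counts = rng.poisson(flux * scale)
    return counts / scale

def rms_coefficient_error(flux, s, e, target_snr, n_trials=200, seed=0):
    """rms scatter of c = (x - s) . e over repeated noisy realizations of one spectrum."""
    rng = np.random.default_rng(seed)
    flux = np.asarray(flux, dtype=float)
    s = np.asarray(s, dtype=float)
    c_true = np.dot(flux - s, e)
    c_noisy = np.array([np.dot(degrade_to_snr(flux, target_snr, rng) - s, e)
                        for _ in range(n_trials)])
    return float(np.sqrt(np.mean((c_noisy - c_true) ** 2)))
```

Because c is a weighted sum over several hundred bins, its noise scales with the per-bin noise rather than growing with the number of bins (for a unit-length e), which is one way to understand why the coefficient degrades gracefully at low S/N.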

Comparing the results of automated classification to that performed visually (cf.
Section 2), we find that ≈ 11% of all dwarfs classified visually are classified differently
by the PCA algorithm (≈ 5% for high signal to noise spectra with (B - V) > 1.1), while
≈ 13% of giants classified visually are classified differently by the algorithm (≈ 8% for
high signal to noise spectra with (B - V) > 1.1). (By high signal to noise spectra we mean
approximately the quarter of the survey sample for which we were most confident of the
visual classification.) We cannot easily quantify the relative precision of the automatic
and visual techniques since it is non-trivial to organize a controlled experiment on humans.
However, in the few cases where an inter- and intra-comparison between classification by
human experts and an automated algorithm has been carried out for related problems
(e.g., Naim et al. 1995, Lahav et al. 1995 for galaxy morphology classification), the scatter
between different human experts was found to be non-negligible and indeed comparable to
the error from the automatic technique. What is clear from our tests is that repeated
human attempts at visual classification of low signal to noise (S/N ≲ 10) spectra are much
less reliable than the machine based approach (although again it is difficult to quantify this
statement).

5. Conclusions 

We discussed a variant of the Principal Component Analysis technique, which is 
designed to discriminate between classes of objects. This technique provides the best 
possible linear discrimination. It is well suited to astronomical problems involving the 
discrimination of spectra. We show that it is very simple to implement this technique on the 
problem of distinguishing K dwarfs from K giants using spectra sampled in the wavelength 
range 4800 to 5300A at l.bA resolution. In principle, with very accurate (B — V) 
photometry, and with very high signal to noise spectra (say, S/N ≳ 30) it is possible to
discriminate visually between K giants and K dwarfs to high accuracy. However, the K star
sample investigated above had poor photometry (δ(B - V) ~ 0.2), and many spectra had
S/N ≲ 10 (for which repeated attempts at visual discrimination gave different results).
The numerical algorithm developed is able to reproduce visual discrimination of the highest
signal to noise spectra (S/N ≳ 20) to approximately 90-95% (and helped to pick out
stars which, with hindsight, had been obviously misclassified). According to numerical
experiments, in which standard star spectra had their signal to noise ratio degraded
artificially to S/N ≈ 5 (about the lowest S/N spectrum obtained), the algorithm works
well with poor quality spectra. The machine discrimination is reliable (the discrimination
criteria remain fixed) and is much more accurate than visual discrimination on low signal
to noise spectra.

Fig. 1. The grid of spectral standard stars. Each star is labeled with a classification

Fig. 2. The most discriminating vector between K dwarfs and K giants (that corresponding

Fig. 3. The color dependence of the coefficient c (defined in the text) with (B - V) for

Table 1. The grid of standard stars

No.  Class     B - V  Star         No.  Class  B - V  Star       [Fe/H]
1    subdwarf  0.85                25          0.82   HD 191179
2    subdwarf  0.94   BD-0°4234    26          0.85   HD 201195
3    subdwarf  1.00   LFT 466      27          0.88   HD 203066
4    subdwarf  1.00                28   giant  0.90              -2.60
5    subdwarf  1.11                29          0.92   HD 171391
6    subdwarf  1.26   LFT 1668     30          0.96   HD 192246
7              0.68   HR 72        31   giant  1.00   HR 3994    0.22
8              0.87   HR 7703      32          1.05
9              0.88   HR 487       33   giant  1.06   HD 157457
10   dwarf     0.95   HR+21°3245   34   giant  1.10   HR 4287    -0.06
11   dwarf     1.00   HR 8382      35   giant  1.10   HR 8841    -0.13
12   dwarf     1.06   HR 8387      36   giant  1.10   HR 8924    0.55
13   dwarf     1.06   BD+10°3665   38   giant  1.13   HR 7430    -0.70
14   dwarf     1.13   BD+22°3406   37   giant  1.13   HD 211475
15   dwarf     1.22   BD+6°4741    39   giant  1.16   HD 107328  -0.47
16   Hyades    0.75   vB 69        40   giant  1.18   HD 110184  -2.50
17             0.84                41          1.22
18             0.86                42          1.23
19                                 43          1.23   HR 5370
20                                 44          1.26
21   Hyades                        45          1.27
22                                 46          1.37              -0.11
23                                 47   giant                    0.35
24             1.20                48          1.50   HR 0224

Table 2. The ten largest eigenvalues of the covariance matrix of standard stars.

No.  λ          % of trace  cum. % of trace
1    7027827.5  50.145      50.145
2    1836905.6  13.107      63.252
3    1277226.4   9.113      72.365
4     606478.0   4.327      76.692
5     295777.4   2.110      78.803
6     251045.7   1.791      80.594
7     207668.2   1.482      82.076
8     159497.0   1.138      83.214
9     149020.3   1.063      84.277
10    146001.9   1.042      85.319